88 research outputs found

    Fusion architectures for automatic subject indexing under concept drift: Analysis and empirical results on short texts

    Indexing documents with controlled vocabularies enables a wealth of semantic applications for digital libraries. Due to the rapid growth of scientific publications, machine learning-based methods are required that assign subject descriptors automatically. While stability of the generative processes behind the underlying data is often tacitly assumed, it is violated in practice. Addressing this problem, this article studies explicit and implicit concept drift, that is, settings with new descriptor terms and new types of documents, respectively. First, the existence of concept drift in automatic subject indexing is discussed in detail and demonstrated by example. Subsequently, architectures for automatic indexing are analyzed in this regard, highlighting individual strengths and weaknesses. The results of the theoretical analysis justify research on the fusion of different indexing approaches, with special consideration of information sharing among descriptors. Experimental results on titles and author keywords in the domain of economics underline the relevance of the fusion methodology, especially under concept drift. Fusion approaches outperformed non-fusion strategies on the tested data sets, which comprised shifts in priors of descriptors as well as covariates. These findings can help researchers and practitioners in digital libraries to choose appropriate methods for automatic subject indexing, as is finally shown by a recent case study.
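The fusion idea in this abstract can be illustrated with a minimal sketch. It is not the architecture evaluated in the article: the two toy indexers, the descriptor IDs (`D1` to `D3`), the training titles, and the `min_votes` threshold are all hypothetical. The point is only that a lexical matcher can assign a descriptor that never occurred in training (explicit drift), while a learned associative model covers descriptors whose labels do not appear verbatim in the title.

```python
from collections import Counter, defaultdict


def lexical_indexer(text, thesaurus):
    """Return descriptors whose label occurs verbatim in the text."""
    lowered = text.lower()
    return {d for d, label in thesaurus.items() if label.lower() in lowered}


def train_associative(examples):
    """Count how often each word co-occurs with each descriptor."""
    word_to_desc = defaultdict(Counter)
    for text, descriptors in examples:
        for word in set(text.lower().split()):
            for d in descriptors:
                word_to_desc[word][d] += 1
    return word_to_desc


def associative_indexer(text, model, min_votes=2):
    """Return descriptors that collect at least min_votes co-occurrence votes."""
    votes = Counter()
    for word in set(text.lower().split()):
        votes.update(model.get(word, {}))
    return {d for d, v in votes.items() if v >= min_votes}


def fused_indexer(text, thesaurus, model):
    """Simplest possible fusion: the union of both indexers' suggestions."""
    return lexical_indexer(text, thesaurus) | associative_indexer(text, model)


# Hypothetical toy data: D3 is a new descriptor with no training examples.
thesaurus = {"D1": "monetary policy", "D2": "inflation", "D3": "cryptocurrency"}
training = [
    ("central bank interest rates", {"D1"}),
    ("interest rates and central bank independence", {"D1"}),
    ("rising consumer prices", {"D2"}),
]
model = train_associative(training)

title = "Cryptocurrency and central bank interest rates"
lexical_only = lexical_indexer(title, thesaurus)
associative_only = associative_indexer(title, model)
fused = fused_indexer(title, thesaurus, model)
```

In this toy case the lexical indexer alone finds only the new descriptor and the associative indexer alone finds only the trained one; the union recovers both. A real system would weigh or rank the suggestions rather than take a plain union.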

    Automatic indexing based on titles and author keywords: a workshop report (Automatische Indexierung auf Basis von Titeln und Autoren-Keywords – ein Werkstattbericht)

    Automatic systems are indispensable for libraries in order to make the rapidly growing number of publications accessible to their users. The ZBW – German National Library of Economics – Leibniz Information Centre for Economics has been gaining practical experience in automatic indexing for some time and is building up its own expertise in this field. Due to legal constraints, it currently investigates, among other things, methods that work without full texts and use only author-generated descriptive metadata. This article gives an insight into an ongoing subproject that uses titles and author keywords to map content-descriptive metadata onto the STW Thesaurus for Economics (Standard-Thesaurus Wirtschaft), and relates it to past activities. We explain the background of the work, examine the system architecture, and report first promising results of a document-oriented approach that combines author keywords and titles to automatically assign STW subject headings to a document.

    Function and Normativity in Darwin and Aristotle (Funktion und Normativität bei Darwin und Aristoteles)

    In a growing number of areas of the modern human sciences, the theory of evolution is applied as the authoritative explanatory model. Its appeal for other disciplines lies in combining the historical dimension of development with a sober, natural-scientific perspective in which phenomena are understood functionally, as relationships oriented toward adaptation. With regard to morality, however, this seems to lead to a dilemma: either morality is traced back to evolutionary adaptations, or moral normativity is acknowledged as irreducible, in which case ethics and evolutionary theory stand side by side, unconnected, as two perspectives from which human behavior can be viewed. This finding is unsatisfactory, however, because contemporary philosophical ethics increasingly sees the need to tie human behavior back to nature. Modern ethical discourse therefore examines not only human rationality but also the biological constitution of human beings. This change of perspective is closely linked to a revival of basic Aristotelian figures of thought, so much so that there is even talk of a ‘Re-Aristotelianization of practical philosophy’ (Höffe). What makes this ancient ethics so attractive is its use of a concept of nature (physis) that is also the foundation of Aristotelian natural philosophy. Aristotelian philosophy thus seems to make it possible to understand human beings as both natural and moral beings without reducing the moral dimension to natural processes. The edited volume discusses, among other things, the extent to which Aristotle's philosophy can offer possible solutions to the problem described above, the disparity between morality and evolutionary theory.

    Pain as a First Manifestation of Paraneoplastic Neuropathies: A Systematic Review and Meta-Analysis.

    INTRODUCTION: Paraneoplastic neurological syndromes (PNS) consist of a heterogeneous group of neurological disorders triggered by cancer. The aim of this systematic review is to estimate the reported prevalence of pain in patients with paraneoplastic peripheral neuropathy (PPN). METHODS: A systematic computer-based literature search was conducted in the PubMed database. RESULTS: Our search strategy resulted in the identification of 126 articles. After the eligibility assessment, 45 papers met the inclusion criteria. Full clinical and neurophysiological data were further extracted and involved 92 patients with PPN (54.5% males, mean age 60.0 ± 12.2 years). The most common first manifestation of PPN is sensory loss (67.4%), followed by pain (41.3%), weakness (22.8%), and sensory ataxia (20.7%). In 13.0% of the cases, pain was the sole first manifestation of the PPN. During the course of the PPN, 57.6% of the patients may experience pain secondary to the neuropathy. CONCLUSIONS: Pain is very prevalent within PPN. Pain specialists should be aware of this. Detailed history-taking, full clinical examination, and requesting nerve conduction studies might lead to an earlier diagnosis of an underlying malignancy.

    Insect pathogens as biological control agents: back to the future

    The development and use of entomopathogens as classical, conservation and augmentative biological control agents have included a number of successes and some setbacks in the past 15 years. In this forum paper we present current information on development, use and future directions of insect-specific viruses, bacteria, fungi and nematodes as components of integrated pest management strategies for control of arthropod pests of crops, forests, urban habitats, and insects of medical and veterinary importance. Insect pathogenic viruses are a fruitful source of microbial control agents (MCAs), particularly for the control of lepidopteran pests. Most research is focused on the baculoviruses, important pathogens of some globally important pests for which control has become difficult due to either pesticide resistance or pressure to reduce pesticide residues. Baculoviruses are accepted as safe, readily mass produced, highly pathogenic and easily formulated and applied control agents. New baculovirus products are appearing in many countries and gaining an increased market share. However, the absence of a practical in vitro mass production system, generally higher production costs, limited post application persistence, slow rate of kill and high host specificity currently contribute to restricted use in pest control. Overcoming these limitations is a key research area in which progress could open up use of insect viruses to much larger markets. A small number of entomopathogenic bacteria have been commercially developed for control of insect pests. These include several Bacillus thuringiensis sub-species, Lysinibacillus (Bacillus) sphaericus, Paenibacillus spp. and Serratia entomophila. B. thuringiensis sub-species kurstaki is the most widely used for control of pest insects of crops and forests, and B. thuringiensis sub-species israelensis and L. sphaericus are the primary pathogens used for medically important pests including dipteran vectors.
These pathogens combine the advantages of chemical pesticides and microbial control agents (MCAs): they are fast acting, easy to produce at a relatively low cost, easy to formulate, have a long shelf life and allow delivery using conventional application equipment and systemics (i.e. in transgenic plants). Unlike broad spectrum chemical pesticides, B. thuringiensis toxins are selective and negative environmental impact is very limited. Of the several commercially produced MCAs, B. thuringiensis (Bt) has more than 50% of market share. Extensive research, particularly on the molecular mode of action of Bt toxins, has been conducted over the past two decades. The Bt genes used in insect-resistant transgenic crops belong to the Cry and vegetative insecticidal protein families of toxins. Bt has been highly efficacious in pest management of corn and cotton, drastically reducing the amount of broad spectrum chemical insecticides used while being safe for consumers and non-target organisms. Despite successes, the adoption of Bt crops has not been without controversy. Although there is a lack of scientific evidence regarding their detrimental effects, this controversy has created the widespread perception in some quarters that Bt crops are dangerous for the environment. In addition to discovery of more efficacious isolates and toxins, an increase in the use of Bt products and transgenes will rely on innovations in formulation, better delivery systems and ultimately, wider public acceptance of transgenic plants expressing insect-specific Bt toxins. Fungi are ubiquitous natural entomopathogens that often cause epizootics in host insects and possess many desirable traits that favor their development as MCAs. Presently, commercialized microbial pesticides based on entomopathogenic fungi largely occupy niche markets. 
A variety of molecular tools and technologies have recently allowed reclassification of numerous species based on phylogeny, as well as matching anamorphs (asexual forms) and teleomorphs (sexual forms) of several entomopathogenic taxa in the Phylum Ascomycota. Although these fungi have been traditionally regarded exclusively as pathogens of arthropods, recent studies have demonstrated that they occupy a great diversity of ecological niches. Entomopathogenic fungi are now known to be plant endophytes, plant disease antagonists, rhizosphere colonizers, and plant growth promoters. These newly understood attributes provide possibilities to use fungi in multiple roles. In addition to arthropod pest control, some fungal species could simultaneously suppress plant pathogens and plant parasitic nematodes as well as promote plant growth. A greater understanding of fungal ecology is needed to define their roles in nature and evaluate their limitations in biological control. More efficient mass production, formulation and delivery systems must be devised to supply an ever increasing market. More testing under field conditions is required to identify effects of biotic and abiotic factors on efficacy and persistence. Lastly, greater attention must be paid to their use within integrated pest management programs; in particular, strategies that incorporate fungi in combination with arthropod predators and parasitoids need to be defined to ensure compatibility and maximize efficacy. Entomopathogenic nematodes (EPNs) in the genera Steinernema and Heterorhabditis are potent MCAs. Substantial progress in research and application of EPNs has been made in the past decade. The number of target pests shown to be susceptible to EPNs has continued to increase. 
Advancements in this regard have primarily been made in soil habitats where EPNs are shielded from environmental extremes, but progress has also been made in use of nematodes in above-ground habitats owing to the development of improved protective formulations. Progress has also resulted from advancements in nematode production technology using both in vivo and in vitro systems; novel application methods such as distribution of infected host cadavers; and nematode strain improvement via enhancement and stabilization of beneficial traits. Innovative research has also yielded insights into the fundamentals of EPN biology including major advances in genomics, nematode-bacterial symbiont interactions, ecological relationships, and foraging behavior. Additional research is needed to leverage these basic findings toward direct improvements in microbial control.

    Fine-grained information extraction from German transthoracic echocardiography reports

    Background: Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite their importance for clinical trials. This work therefore aimed at the development and evaluation of an information extraction component with a fine-grained terminology that makes it possible to recognize almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg. Methods: A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated on documents with different layouts. Results: The final system achieved state-of-the-art precision (micro average .996) and recall (micro average .961) on 100 test documents that represent more than 90 % of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with F1 = .989 (micro average) and F1 = .963 (macro average). As a result of keyword matching and restraint concept extraction, the system obtained high precision also on unstructured or exceptionally short documents, and documents with uncommon layout.
Conclusions: The developed terminology and the proposed information extraction system make it possible to extract fine-grained information from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.
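The abstract reports both micro-averaged and macro-averaged F1, which can differ noticeably when some attributes are rare. The following minimal sketch shows how the two averages are computed from per-class counts. The attribute names and the counts are hypothetical and are not taken from the paper's evaluation.

```python
def f1(tp, fp, fn):
    """F1 score from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_f1(counts):
    """Pool the counts of all classes, then compute a single F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)


def macro_f1(counts):
    """Compute F1 per class, then take the unweighted mean."""
    scores = [f1(*c) for c in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts: one frequent and one rare attribute.
counts = {
    "frequent_attribute": (90, 2, 3),
    "rare_attribute": (2, 0, 3),
}

micro = micro_f1(counts)
macro = macro_f1(counts)
```

With these toy counts the micro average is dominated by the frequent attribute, while the macro average is pulled down by the poorly recognized rare one. A macro value below the micro value, as in the abstract, is what this mechanism produces when some less frequent classes are recognized less reliably.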

    Conditional random fields for local adaptive reference extraction

    The accurate extraction of bibliographic information from scientific publications is an active field of research. Machine learning methods, especially sequence labeling approaches like Conditional Random Fields (CRF), are often applied to this reference extraction task, but still suffer from the ambiguity of reference notation. Reference sections apply a predefined style guide and contain only homogeneous references. Therefore, other references of the same paper or journal can often provide evidence of how the fields of a reference are correctly labeled. We propose a novel approach that exploits the similarities within a document. Our process model uses information from unlabeled documents directly during the extraction task in order to automatically adapt to the perceived style guide. This is implemented by changing the manifestation of the features for the applied CRF. The experimental results show considerable improvements compared to the common approach. We achieve an average F1 score of 96.7 % and an instance accuracy of 85.4 % on the test data set.
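The idea of adapting features to the style observed within one document can be sketched as follows. This is a hypothetical illustration, not the feature set or process model of the paper: the regular expression, the function names, and the example references are all assumptions. The sketch estimates which character usually precedes the year in a reference section and exposes, per reference, whether it agrees with that local convention. A sequence labeler such as a CRF could consume such a value as one additional feature.

```python
import re
from collections import Counter

# Assumption: a four-digit number starting with 19 or 20 is a year candidate.
YEAR = re.compile(r"\b(19|20)\d{2}\b")


def year_context(reference):
    """Return the character right before the first year candidate, or None."""
    match = YEAR.search(reference)
    if match is None or match.start() == 0:
        return None
    return reference[match.start() - 1]


def document_style(references):
    """Estimate the local style: the most frequent character before a year."""
    contexts = [c for c in map(year_context, references) if c is not None]
    if not contexts:
        return None
    return Counter(contexts).most_common(1)[0][0]


def local_features(reference, style):
    """Features for one reference, relative to the style of its document."""
    context = year_context(reference)
    return {
        "year_prefix": context,
        "year_prefix_matches_document_style": context is not None and context == style,
    }


# Hypothetical reference section that puts the publication year in parentheses.
references = [
    "Smith, J. (2010). Title A. Journal X.",
    "Lee, K. (2012). Title B. Journal Y.",
    "Doe, A. Report 2001 revisited (2015). Journal Z.",
]

style = document_style(references)
features = [local_features(r, style) for r in references]
```

In the third reference the first year candidate, 2001, is not preceded by the opening parenthesis that the rest of the section uses, so the feature signals that it is probably part of the title rather than the publication year.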